*The replication code produces the results for all tables in the paper.
*There are four input data files: "Determinants.dta" (for Tables 2, 3 and 7), "main_data.dta" (for Tables 1 and 6), and "port.dta" and "ff_factors.dta" (for Tables 4 and 5). Running the code produces the results of the empirical analyses in the paper.
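*ssc install reghdfe --- needed to fit regressions with multiple layers of fixed effects
*ssc install ftools --- dependency of the reghdfe package
*ssc install xtfmb --- needed to fit Fama-MacBeth regressions
*ssc install winsor2 --- needed to winsorize variables
*ssc install egenmore --- provides the egen function xtile() used to form the portfolios
*ssc install center --- needed to demean the diversity scores within groups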
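*Optional sketch, not part of the original replication code: install any of the packages above that are missing.
*Assumes internet access to SSC; winsor2, center and egenmore (which supplies the egen function xtile()) are included because later commands call them.
foreach pkg in ftools reghdfe xtfmb winsor2 center {
    capture which `pkg'
    if _rc ssc install `pkg'
}
*egenmore has no ado-file of its own name, so check for the file behind its xtile() function instead
capture which _gxtile
if _rc ssc install egenmore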


*****Use "Determinants.dta" to produce results in Table 2, 3 and 7. 
**Table 2 Validation check**
use "Determinants.dta",clear
foreach v in age_frac_year ethni_frac_year gender_frac_year{
reg `v' DiversityInc 
dis(_b[_cons])
dis(_b[_cons]+_b[DiversityInc])
}
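*Cross-check sketch, assuming DiversityInc is a 0/1 indicator (an assumption; its coding is not documented here):
*in that case _b[_cons] is the mean of the diversity measure when DiversityInc==0 and _b[_cons]+_b[DiversityInc] is the mean when DiversityInc==1,
*so the two displayed values should match the group means reported by tabstat.
foreach v in age_frac_year ethni_frac_year gender_frac_year{
tabstat `v' if !missing(DiversityInc), by(DiversityInc) statistics(mean n)
}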
foreach v in age_frac_year ethni_frac_year gender_frac_year{
 pwcorr  `v' DIV_score DIV_str_A DIV_str_B DIV_str_C DIV_str_D DIV_str_E DIV_str_F  DIV_con_A DIV_con_B DIV_con_C DIV_con_D ,star(0.05) 
}
foreach v in age_frac_year ethni_frac_year gender_frac_year{
pwcorr  `v' PolicyDiversityandOpportunity TargetsDiversityandOpportunit ,star(0.05) 
}
**Table 3 Determinants of diversity**
gen nonamerican=1-pct_american 
xtset gvkey year
reghdfe  f.age_frac_year  BM log_me lev asset_growth InstOwn_Perc  intan_at  log_ananumber , abs(sic2 year) cluster(company_id_deter) keepsingleton 
reghdfe  f.ethni_frac_year BM log_me lev asset_growth InstOwn_Perc  intan_at  log_ananumber, abs(sic2 year) cluster(company_id_deter)  keepsingleton 
reghdfe  f.gender_frac_year BM log_me lev asset_growth InstOwn_Perc  intan_at  log_ananumber, abs(sic2 year) cluster(company_id_deter)  keepsingleton 
reghdfe  f.age_frac_year  BM log_me lev asset_growth InstOwn_Perc  intan_at  log_ananumber std_age  pct_female nonamerican , abs(sic2 year) cluster(company_id_deter) keepsingleton 
reghdfe  f.ethni_frac_year BM log_me lev asset_growth InstOwn_Perc  intan_at   log_ananumber std_age  pct_female nonamerican , abs(sic2 year) cluster(company_id_deter)  keepsingleton 
reghdfe  f.gender_frac_year BM log_me lev asset_growth InstOwn_Perc  intan_at  log_ananumber std_age  pct_female nonamerican, abs(sic2 year) cluster(company_id_deter)  keepsingleton 
reghdfe  f.age_frac_year  BM log_me lev asset_growth InstOwn_Perc  intan_at   log_ananumber std_age  pct_female nonamerican EnvironmentalPillarScore SocialPillarScore GovernancePillarScore , abs(sic2 year) cluster(company_id_deter) keepsingleton 
reghdfe  f.ethni_frac_year BM log_me lev asset_growth InstOwn_Perc  intan_at   log_ananumber std_age  pct_female nonamerican EnvironmentalPillarScore SocialPillarScore GovernancePillarScore, abs(sic2 year) cluster(company_id_deter)  keepsingleton 
reghdfe  f.gender_frac_year BM log_me lev asset_growth InstOwn_Perc  intan_at  log_ananumber std_age  pct_female nonamerican EnvironmentalPillarScore SocialPillarScore GovernancePillarScore, abs(sic2 year) cluster(company_id_deter)  keepsingleton
global cont1 BM log_me lev asset_growth InstOwn_Perc intan_at RET_sd
**Table 7 Future operating performance**
foreach v in  age_frac_year ethni_frac_year  gender_frac_year $cont1 {
drop if `v'==.
}
winsor2  $cont1 age_frac_year ethni_frac_year  gender_frac_year sale_emp  roe GrossProfit,cuts(1 99) replace 
gen lns=ln(sale_emp)
xtset company_id_deter year
foreach v in age_frac_year ethni_frac_year gender_frac_year{
reghdfe  f.lns `v'  $cont1  , abs(company_id_deter  year) cluster(company_id_deter ) keepsingleton
}
foreach v in age_frac_year ethni_frac_year gender_frac_year{
reghdfe  f.GrossProfit `v' $cont1 , abs(company_id_deter  year) cluster(company_id_deter ) keepsingleton
}
foreach v in age_frac_year ethni_frac_year gender_frac_year{
reghdfe  f.roe `v' $cont1 , abs(company_id_deter  year) cluster(company_id_deter ) keepsingleton
}

****Table 4 and 5 portfolio analysis*****
*This part requires "port.dta" and "ff_factors.dta".
*age_frac_year_current, ethni_frac_year_current and gender_frac_year_current are used to replicate the results in Table 4, in which the diversity score is measured at t
*and the return is measured at t+1.
*age_frac_year_lag1 (gender_frac_year_lag1, ethni_frac_year_lag1) indicates age (gender, ethnicity) diversity measured at t-1, and return is measured at t+1. This is used to produce results in the first panel of Table 5 ("Return t+1")
*age_frac_year_lag2 (gender_frac_year_lag2, ethni_frac_year_lag2) indicates age (gender, ethnicity) diversity measured at t-2, and return is measured at t+1. This is used to produce results in the second panel of Table 5 ("Return t+2")
*age_frac_year_lag3 (gender_frac_year_lag3, ethni_frac_year_lag3) indicates age (gender, ethnicity) diversity measured at t-3, and return is measured at t+1. This is used to produce results in the third panel of Table 5 ("Return t+3")
*age_frac_year_lag4 (gender_frac_year_lag4, ethni_frac_year_lag4) indicates age (gender, ethnicity) diversity measured at t-4, and return is measured at t+1. This is used to produce results in the fourth panel of Table 5 ("Return t+4")
*age_frac_year_lag5 (gender_frac_year_lag5, ethni_frac_year_lag5) indicates age (gender, ethnicity) diversity measured at t-5, and return is measured at t+1. This is used to produce results in the fifth panel of Table 5 ("Return t+5")
foreach v in age_frac_year_current ethni_frac_year_current gender_frac_year_current ///
    age_frac_year_lag1 gender_frac_year_lag1 ethni_frac_year_lag1 ///
    age_frac_year_lag2 gender_frac_year_lag2 ethni_frac_year_lag2 ///
    age_frac_year_lag3 gender_frac_year_lag3 ethni_frac_year_lag3 ///
    age_frac_year_lag4 gender_frac_year_lag4 ethni_frac_year_lag4 ym_div ///
    age_frac_year_lag5 gender_frac_year_lag5 ethni_frac_year_lag5 {
*reload the data on every pass, because collapse and reshape below replace the data in memory
use "port.dta",clear
drop if  `v'==.
winsor2   MV RET ,cuts(1 99) replace by(ym)
*drop duplicate stock-months first, because xtset does not allow repeated time values within a panel
duplicates drop stock_id ym,force
xtset stock_id ym
gen lag_MV=l.MV
gen exret=RET-rf
*sort stocks into quartile portfolios (j=4) based on diversity
local j =4
egen g1=xtile( `v'),by(ym) nq(`j')
gen log_MV=log(lag_MV)
*for value-weighted portfolio
collapse (mean) exret  [aweight = lag_MV],by(ym g1)
*for equal-weighted portfolio, using this command instead
*collapse (mean) exret  ,by(ym g1)
reshape wide  exret,i(ym) j(g1)
merge m:1 ym using "ff_factors.dta"
drop if _merge==2
drop _merge
sort ym
ge time=_n
tsset time
*For Table 5, skip the forval loop below (the lines from forval through its closing brace)
forval i=1/`j'{
replace exret`i'=exret`i'*100
 newey exret`i' umd hml smb mktrf,lag(4) 
}
*create a spread portfolio
gen longshort=exret`j'-exret1
newey longshort umd hml smb mktrf,lag(4)
}




****Producing results of Table 1 and Table 6*****
use "main_data.dta",clear
*Note: the commands below use the global macro cont, which must be defined before this point (for example with the same control list as cont1); an undefined global expands to nothing, so the controls would be silently omitted.
**Table 1 Summary statistics**
tabstat  RET  age_frac_year gender_frac_year ethni_frac_year me  $cont , statistics(N mean  sd p25 p50 p75) column(statistics)
**Table 6 regressions**
*Perform Fama-MacBeth regressions in columns 1, 2, and 3
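*Sketch, assuming company_id_main and ym uniquely identify observations (not confirmed here):
*xtfmb needs panel settings, so declare them only if main_data.dta does not already carry them.
capture xtset
if _rc xtset company_id_main ym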
xtfmb  RET age_frac_year $cont, lag(4)
xtfmb  RET gender_frac_year $cont, lag(4)
xtfmb  RET ethni_frac_year $cont, lag(4)
*Perform Panel OLS regression in columns 4, 5, and 6
reghdfe RET age_frac_year $cont, abs(company_id_main ym) cluster(company_id_main) keepsingleton
reghdfe RET gender_frac_year $cont, abs(company_id_main ym) cluster(company_id_main) keepsingleton
reghdfe RET ethni_frac_year $cont, abs(company_id_main ym) cluster(company_id_main) keepsingleton
****Table 7 Robustness check*****
*Panel A - firms with data at the year of 2000
reghdfe RET age_frac_year $cont if merge2000==3, abs(company_id_main ym) cluster(company_id_main) keepsingleton
 reghdfe RET gender_frac_year $cont if merge2000==3, abs(company_id_main ym) cluster(company_id_main) keepsingleton
 reghdfe RET ethni_frac_year $cont if merge2000==3, abs(company_id_main ym) cluster(company_id_main) keepsingleton
*Panel B - firms with information over the entire period
 reghdfe RET age_frac_year $cont if mergefull==3, abs(company_id_main ym) cluster(company_id_main) keepsingleton
 reghdfe RET gender_frac_year $cont if mergefull==3, abs(company_id_main ym) cluster(company_id_main) keepsingleton
 reghdfe RET ethni_frac_year $cont if mergefull==3, abs(company_id_main ym) cluster(company_id_main) keepsingleton
 *Panel C - subperiods after 2011
reghdfe RET age_frac_year $cont if year>=2011, abs(company_id_main ym) cluster(company_id_main) keepsingleton
 reghdfe RET gender_frac_year $cont if year>=2011, abs(company_id_main ym) cluster(company_id_main) keepsingleton
 reghdfe RET ethni_frac_year $cont if year>=2011, abs(company_id_main ym) cluster(company_id_main) keepsingleton
*Panel D - change of diversity score
reghdfe RET change_age_frac_year $cont, abs(company_id_main ym) cluster(company_id_main) keepsingleton
reghdfe RET change_gender_frac_year $cont, abs(company_id_main ym) cluster(company_id_main) keepsingleton
reghdfe RET change_ethni_frac_year $cont, abs(company_id_main ym) cluster(company_id_main) keepsingleton 
*Panel E - industry-adjusted diversity score
foreach v in age_frac_year ethni_frac_year gender_frac_year {
bys sic2 ym: center `v'
}
reghdfe RET c_age_frac_year $cont, abs(company_id_main ym) cluster(company_id_main) keepsingleton
reghdfe RET c_gender_frac_year $cont, abs(company_id_main ym) cluster(company_id_main) keepsingleton
reghdfe RET c_ethni_frac_year $cont, abs(company_id_main ym) cluster(company_id_main) keepsingleton
